
24 research outputs found

Adaptive Resource Scheduling for Energy Efficient QRD Processor with DVFS

Author: Liu Liang
Liu Yangxurui
Prabhu Hemanth
Öwall Viktor
Publication date: 01/01/2015

This paper presents an energy-efficient adaptive QR decomposition scheme for Long Term Evolution Advanced (LTE-A) downlink systems. The proposed scheme provides performance robustness against fluctuating wireless channels while maintaining a lower workload on reconfigurable hardware. A statistics-based algorithm-switching strategy is employed to reduce the workload and stabilize the computing resource requirement of the QR decomposition. With run-time resource allocation, computing resources are assigned to the segments with the highest performance gain to reduce performance loss. By utilizing dynamic voltage and frequency scaling (DVFS), we further exploit the power-saving potential in various workload situations while maintaining fixed throughput. When mapped onto a coarse-grained reconfigurable vector-based platform, the proposed technique reduces power by up to 57.8% in the EVA-5 scenario and by 24.4%, with a maximum SNR loss of 1 dB, in the EVA-70 scenario.
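The DVFS saving mentioned in the abstract above can be illustrated with the standard first-order CMOS dynamic power model, P ≈ C·V²·f. The following is a minimal sketch under that model only. The function name and the frequency and voltage ratios are hypothetical and are not taken from the paper, and leakage power is ignored.

```python
def relative_dynamic_power(freq_ratio: float, volt_ratio: float) -> float:
    """Dynamic power relative to the nominal operating point.

    Uses the first-order model P ~ C * V^2 * f with the switched
    capacitance C unchanged, so only the two ratios matter.
    """
    if freq_ratio <= 0 or volt_ratio <= 0:
        raise ValueError("ratios must be positive")
    return volt_ratio ** 2 * freq_ratio


# Hypothetical operating point (not from the paper): a lighter workload lets
# the clock run at 50% of nominal while throughput stays fixed, which in turn
# allows the supply voltage to drop to 80% of nominal.
saving = 1.0 - relative_dynamic_power(freq_ratio=0.5, volt_ratio=0.8)
print(f"dynamic power saving: {saving:.0%}")
```

Because the voltage enters quadratically, lowering the supply together with the clock saves more than frequency scaling alone, which is why a reduced workload translates into a more than proportional power reduction in this model.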

Lund University Publications

Crossref

Energy Efficient SQRD Processor for LTE-A using a Group-sort Update Scheme

Author: Edfors Ove
Liu Liang
Prabhu Hemanth
Zhang Chenxin
Öwall Viktor
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2014

This paper presents an energy-efficient sorted QR decomposition (SQRD) processor for 3GPP LTE-Advanced (LTE-A) systems. The processor adopts a hybrid decomposition scheme to reduce computational complexity and provides a wide range of performance-complexity trade-offs. Based on the energy distribution of the spatial channels, it switches between brute-force SQRD and a low-complexity group-sort QR-update strategy, which is proposed in this work to utilize the LTE-A pilot pattern effectively. As a proof of concept, a run-time reconfigurable vector processor is developed to implement this adaptive-switching QR decomposition algorithm efficiently. In 65 nm CMOS technology, the proposed SQRD processor occupies a core area of 0.71 mm² and has a throughput of up to 100 MQRD/s. Compared to the brute-force approach, an energy reduction of 5% to 33% is achieved.

Lund University Publications

Crossref

Science with the Daksha High Energy Transients Mission

Author: Adalja Hitesh Kumar L.
Anupama G C
Bala Suman
Banerjee Smaranika
Basu Judhajeet
Belatikar Hrishikesh
Beniamini Paz
Bhaganagare Mahesh
Bhalerao Varun
Bhaskar Ankush
Bhattacharjee Soumyadeep
Bhattacharya Dipankar
Bose Sukanta
Cenko Brad
Chanda Mehul Vijay
Dewangan Gulab
Dixit Vishal
Dutta Anirban
Gawade Priyanka
Ghodgaonkar Abhijeet
Goyal Shiv Kumar
Gunasekaran Suresh
Guruprasad P J
Hemanth Manikantan
Hotokezaka Kenta
Iyyani Shabnam
Kasliwal Mansi
Koyande Jayprakash G.
Kulkarni Salil
Kutty APK
Ladiya Tinkal
Marla Deepak
Mate Sujay
Mehla Advait
Mithun N. P. S.
More Surhud
Mote Rakesh
Mukherjee Dipanjan
Narang Sanjoli
Narendranath Shyama
Nema Ayush
Nimbalkar Sudhanshu
Nissanke Samaya
Pai Archana
Palit Sourav
Patel Arpit
Patel Jinaykumar
Paul Biswajit
Pradeep Priya
Ramachandran Prabhu
Rana Vikram
Roy Kinjal
Saiguhan B. S. Bharath
Saji Joseph
Saleem M.
Saraogi Divita
Sastry Parth
Sawant Disha
Shanmugam M.
Sharma Piyush
Shetye Amit
Singh Nishant
Singh Shreeya
Singhal Akshat
Sreekumar S.
Sridhar Srividhya
Srinivasan Rahul
Tallur Siddharth
Tendulkar Shriharsh
Tiwari Neeraj K.
Vadawale Santosh
Vadladi Amrutha Lakshmi
Vaishnava C. S.
Vishwakarma Sandeep
Waratkar Gaurav
Publication date: 22/11/2022

We present the science case for the proposed Daksha high energy transients mission. Daksha will comprise two satellites covering the entire sky from 1 keV to >1 MeV. The primary objectives of the mission are to discover and characterize electromagnetic counterparts to gravitational wave sources, and to study Gamma Ray Bursts (GRBs). Daksha is a versatile all-sky monitor that can address a wide variety of science cases. With its broadband spectral response, high sensitivity, and continuous all-sky coverage, it will discover fainter and rarer sources than any other existing or proposed mission. Daksha can make key strides in GRB research with polarization studies, prompt soft spectroscopy, and fine time-resolved spectral studies. Daksha will provide continuous monitoring of X-ray pulsars. It will detect magnetar outbursts and high energy counterparts to Fast Radio Bursts. Using Earth occultation to measure source fluxes, the two satellites together will obtain daily flux measurements of bright hard X-ray sources including active galactic nuclei, X-ray binaries, and slow transients like novae. Correlation studies between the two satellites can be used to probe primordial black holes through lensing. Daksha will have a set of detectors continuously pointing towards the Sun, providing excellent hard X-ray monitoring data. Closer to home, the high sensitivity and time resolution of Daksha can be leveraged for the characterization of Terrestrial Gamma-ray Flashes.

Comment: 19 pages, 7 figures. Submitted to ApJ. More details about the mission at https://www.dakshasat.in

arXiv.org e-Print Archive

Hardware Implementation of Baseband Processing for Massive MIMO

Author: Prabhu Hemanth
Publication venue: The Department of Electrical and Information Technology
Publication date: 07/03/2017

In the near future, the number of connected mobile devices and data-rates are expected to increase dramatically. Demands exceed the capability of the currently deployed (4G) wireless communication systems. Development of 5G systems is aiming for higher data-rates, better coverage, backward compatibility, and conformance with “green communication” to lower energy consumption. Massive Multiple-Input Multiple-Output (MIMO) is a technology with the potential to fulfill these requirements. In massive MIMO systems, base stations are equipped with a very large number of antennas compared to 4G systems, serving a relatively low number of users simultaneously in the same frequency and time resource. Exploiting the high spatial degrees-of-freedom allows for aggressive spatial multiplexing, resulting in high data-rates without increasing the spectrum. More importantly, achieving high array gains and eliminating inter-user interference results in simpler mobile terminals. These advantages of massive MIMO require handling a large number of antennas efficiently by performing baseband signal processing. Compared to small-scale MIMO base stations, the processing can be much more computationally intensive, in particular considering the large dimensions of the matrices. In addition to computational complexity, meeting latency requirements is also crucial. Another aspect is the power consumption of the baseband processing. Typically, the major contributors to power consumption are power amplifiers and analog components; in massive MIMO, however, the transmit power at each antenna can be lowered drastically (by the square of the number of antennas). Thus, the power consumption of the baseband processing becomes more significant in relation to other contributions.
This puts forward the main challenge tackled in this thesis, i.e., how to implement low-latency baseband signal processing modules with high hardware and energy efficiency. The focus of this thesis has been on co-optimization of algorithms and hardware implementations to meet the aforementioned challenges and requirements. Algorithm optimization is performed to lower computational complexity, e.g., of large-scale matrix operations, and also on the system level to relax constraints on analog/RF components, lowering cost and improving efficiency. These optimizations were evaluated by taking into consideration the hardware cost and device-level parameters. To this end, a massive MIMO central baseband pre-coding/detection chip was fabricated in 28 nm FD-SOI CMOS technology and measured. The algorithm and hardware co-optimization resulted in the highest reported pre-coding area and energy efficiency of 34.1 QRD/s/gate and 6.56 nJ/QRD, respectively. For detection, compared to small-scale MIMO systems, massive MIMO with linear schemes provided superior performance, with area and energy efficiency of 2.02 Mb/s/kGE and 60 pJ/b. The array and spatial multiplexing gains in massive MIMO, combined with high hardware efficiency and schemes to lower constraints on RF/analog components, make it extremely promising for future deployments.

Lund University Publications

A Cholesky decomposition based massive MIMO uplink detector with adaptive interpolation

Author: Edfors Ove
Gangarajaiah Rakesh
Liu Liang
Prabhu Hemanth
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 25/09/2017

An adaptive uplink detection scheme for a Massive MIMO (MaMi) base station serving up to 16 users is presented. Considering the user distribution in a cell, selective matched filtering (MF) is proposed for non-interference-limited users, and a Cholesky decomposition (CD) based zero-forcing (ZF) detector is implemented for the remaining users. Channel conditions such as coherence bandwidth are exploited to lower computational complexity by interpolating CD outputs. Performance evaluations on measured MaMi channels indicate a 60-fold reduction in computation count with less than 1 dB loss at an uncoded bit error rate of 10⁻³. For the CD, a reconfigurable processor optimized for 8×8 matrices, with a block decomposition extension to support up to 16×16 matrices, is presented. Circuit-level optimizations in 28 nm FD-SOI resulted in an energy of 1.4 nJ/CD at 400 MHz, and post-layout simulations indicate a 50% reduction in power dissipation when operating with the proposed interpolation-based detection scheme compared to traditional ZF detection.
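The Cholesky-based zero-forcing step named in the abstract above can be sketched in software. ZF detection solves the normal equations (Hᴴ H) x = Hᴴ y, and because the Gram matrix A = Hᴴ H is Hermitian positive definite, it can be factored as A = L Lᴴ and solved with two triangular substitutions. The code below is a minimal pure-Python illustration of that textbook procedure. The function names and the small test matrix are hypothetical, and the sketch does not model the paper's processor, its block decomposition, or its interpolation scheme.

```python
import math


def cholesky(a):
    """Return lower-triangular L with A = L * L^H (A Hermitian positive definite)."""
    n = len(a)
    low = [[0j] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: remove the energy already explained by earlier columns.
        diag = complex(a[j][j] - sum(abs(low[j][k]) ** 2 for k in range(j))).real
        if diag <= 0:
            raise ValueError("matrix is not positive definite")
        low[j][j] = complex(math.sqrt(diag))
        for i in range(j + 1, n):
            acc = a[i][j] - sum(low[i][k] * low[j][k].conjugate() for k in range(j))
            low[i][j] = acc / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L * L^H) x = b by forward and backward substitution."""
    n = len(low)
    y = [0j] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0j] * n
    for i in reversed(range(n)):  # backward substitution: L^H x = y
        tail = sum(low[k][i].conjugate() * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - tail) / low[i][i]
    return x


# Hypothetical 2x2 Gram matrix (H^H H) and matched-filter output (H^H y).
gram = [[4, 2], [2, 3]]
mf_out = [2, 1]
estimate = solve_cholesky(cholesky(gram), mf_out)
print(estimate)
```

Factoring once and reusing L for several right-hand sides is what makes the decomposition attractive when the channel, and hence the Gram matrix, stays approximately constant over neighbouring subcarriers.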

Lund University Publications

Crossref

3.6 A 60pJ/b 300Mb/s 128×8 Massive MIMO precoder-detector in 28nm FD-SOI

Author: Edfors Ove
Liu Liang
Prabhu Hemanth
Rodrigues Joachim
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2017

Further exploitation of the spatial domain, as in Massive MIMO (MaMi) systems, is imperative to meet future communication requirements [1]. Up-scaling conventional 4×4 small-scale MIMO implementations to MaMi is prohibitive in terms of flexibility, as well as area and power cost. This work discloses a 1.1 mm² 128×8 MaMi baseband chip, achieving up to 12 dB array gain and 2× spatial multiplexing gain. The area cost, compared to previous state-of-the-art MIMO implementations [2-3], is reduced by 53% and 17% for up- and down-link, respectively. Algorithm optimizations and a highly flexible framework were evaluated on real measured channels. Extensive hardware time multiplexing lowered the area cost, and leveraging flexible FD-SOI body bias and clock gating resulted in an energy efficiency of 6.56 nJ/QRD and 60 pJ/b at a 300 Mb/s detection rate.

Lund University Publications

A 1070 pJ/b 169 Mb/s Quad-core Digital Baseband SoC for Distributed and Cooperative Massive MIMO in 28 nm FD-SOI

Author: Edfors Ove
Liu Liang
Prabhu Hemanth
Sheikh Farhana
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 13/06/2021

A 2.2 mm² fully digital baseband SoC with four heterogeneous cores for 128-node, 8-user distributed massive MIMO is presented. Two specialized DSPs perform rapid over-the-air synchronization within 0.1 ms. A highly optimized 8-complex-lane MIMO vector processor provides a 4× hardware efficiency improvement over general-purpose processors. Circuit optimizations and the use of body bias result in a measured energy of 1070 pJ/b at a 169 Mb/s detection rate.

Lund University Publications

Approximative Matrix Inverse Computations for Very-large MIMO and Applications to Linear Pre-coding Systems

Author: Edfors Ove
Prabhu Hemanth
Rodrigues Joachim
Rusek Fredrik
Publication venue: IEEE - Institute of Electrical and Electronics Engineers Inc.
Publication date: 01/01/2013

In very-large multiple-input multiple-output (MIMO) systems, the base station (BS) is equipped with a very large number of antennas compared to previously considered systems. Increasing the number of antennas has various advantages, but some schemes require handling large matrices for joint processing (pre-coding) at the base station. Dirty paper coding (DPC) is an optimal pre-coding scheme but has very high complexity. However, with an increasing number of BS antennas, linear pre-coding performance tends to that of the optimal DPC. Although linear pre-coding is less complex than DPC, it still requires computing pseudo-inverses of large matrices. In this paper we present a low-complexity approximation of down-link zero-forcing (ZF) linear pre-coding for very-large multi-user MIMO systems. Instead of traditional exact computation, the matrix inversion is approximated with a Neumann series expansion that exploits special properties of the matrices, thereby reducing hardware cost. With this approximation of linear pre-coding, the computational complexity can be reduced significantly for large enough systems, i.e., those with enough BS antenna elements. For the investigated case of 8 users, we obtain 90% of the full ZF sum rate, with lower computational complexity, when the number of BS antennas per user is about 20 or more.
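The Neumann series idea in the abstract above can be shown on a small example. For a matrix Z with diagonal part D, the identity Z⁻¹ = Σₙ (I − D⁻¹Z)ⁿ D⁻¹ holds whenever the series converges, which is the case for strongly diagonally dominant matrices. Truncating the sum after a few terms replaces the exact inversion with matrix multiplications. The sketch below is a minimal pure-Python illustration; using diag(Z) as the starting approximation is an assumption made here for illustration, since the abstract does not say which initial matrix the authors use, and the function names and test matrix are hypothetical.

```python
def mat_mul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def neumann_inverse(z, terms):
    """Approximate the inverse of z with `terms` terms of a Neumann series.

    Computes sum_{n=0}^{terms-1} (I - D^-1 Z)^n D^-1 with D = diag(Z).
    The series converges only if the spectral radius of (I - D^-1 Z) is
    below 1, e.g. for strictly diagonally dominant z.
    """
    n = len(z)
    eye = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d_inv = [[1.0 / z[i][i] if i == j else 0.0 for j in range(n)]
             for i in range(n)]
    dz = mat_mul(d_inv, z)
    err = [[eye[i][j] - dz[i][j] for j in range(n)] for i in range(n)]
    acc = [[0.0] * n for _ in range(n)]
    power = eye  # (I - D^-1 Z)^0
    for _ in range(terms):
        acc = [[acc[i][j] + power[i][j] for j in range(n)] for i in range(n)]
        power = mat_mul(power, err)
    return mat_mul(acc, d_inv)


# Hypothetical diagonally dominant matrix standing in for a Gram matrix.
z = [[4.0, 1.0], [1.0, 3.0]]
for terms in (1, 2, 4, 8):
    print(terms, neumann_inverse(z, terms))
```

The more dominant the diagonal, the faster the truncated series approaches the exact inverse. This matches the abstract's observation that the approximation works well once the number of BS antennas per user is large enough.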

CiteSeerX

Lund University Publications

High Throughput Constant Envelope Pre-coder for Massive MIMO Systems

Author: Edfors Ove
Prabhu Hemanth
Rodrigues Joachim
Rusek Fredrik
Publication venue: IEEE - Institute of Electrical and Electronics Engineers Inc.
Publication date: 01/01/2015

This study describes a high-throughput constant envelope (CE) pre-coder for Massive MIMO systems. A large number of antennas (M), on the order of hundreds, serve a relatively small number of users (K) simultaneously. The stringent amplitude constraint (only phase changes) in the CE scheme is motivated by the use of highly power-efficient non-linear RF power amplifiers. We propose a scheme that computes the CE signals to be transmitted based on box-constrained regression (coordinate descent), with an O(2MK) complexity per iteration per user symbol. A highly scalable systolic architecture is implemented, where M Processing Elements (PEs) perform the pre-coding for a system with up to K = 16 users. This systolic architecture results in a very high throughput of 500 Msamples/s (at a 500 MHz clock rate) with a gate count of 14 K per PE in 65 nm technology.

Lund University Publications

Crossref

Algorithm and Hardware Aspects of Pre-coding in Massive MIMO Systems

Author: Edfors Ove
Liu Liang
Prabhu Hemanth
Rodrigues Joachim
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2015

Massive Multiple-Input Multiple-Output (MIMO) systems have been shown to improve both spectral and energy efficiency by one or more orders of magnitude by efficiently exploiting the spatial domain. Low-cost RF chains can be employed to reduce the Base Station (BS) cost; however, this may require additional baseband processing to handle distortions induced by hardware impairments. In this article, the reduction of the Peak-to-Average power Ratio (PAR) of the transmitted signals and IQ imbalance in the mixer are analyzed for the down-link. We analyze various pre-coding schemes and estimate the required processing energy per transmitted information bit. Gate-level simulations show that the energy cost of performing pre-coding and tackling hardware impairments ranges from very low to reasonable, compared to the processing necessary in a system without impairments.

Lund University Publications